Correlation in Multiversion Software


Toke Jayachandran
Code MA/Jy

Naval Postgraduate School
Monterey, CA 93943

September, 1996


ABSTRACT 

It has been established, both theoretically [1] and experimentally [2], that independently
developed redundant software versions fail dependently. Several probability models that
account for this phenomenon of concurrent failures have appeared in the literature. Tomek et
al. [3] proposed an intensity distribution that introduced a specific type of correlated failure
pattern, namely pairwise correlation between software modules. They derived the intensity pmf
for N = 2 and N = 3 modules and indicated the desirability of an efficient algorithm to compute
the pmf for larger values of N. This paper contains an easily programmable algorithm to
generate the pmf for any choice of N.


1  INTRODUCTION 

The two principal techniques for software redundancy are N-version programming [4] and
recovery blocks [5]. Both require multiple independently developed software versions to
achieve high reliability in software systems. Initially, it was believed that failures in
independently produced software occur independently, and system reliability computations were
based on this premise. It was subsequently demonstrated, both theoretically [1] and
experimentally [2], that multiple versions can fail simultaneously for some choices of inputs.
As a result, reliability estimates that assume independent failures can be overly optimistic.
Several papers introducing probability models that allow for concurrent failures have appeared
in the literature recently. Nicola and Goyal [6] proposed a model for simultaneous failure of
independent software modules, and the model has been shown to provide a good fit to the
experimental data in [2]. Tomek et al. [3] introduced another model for generating the
probability distribution (intensity distribution) of the number of modules (in an N-version
system) that fail concurrently for a randomly selected input. The latter model incorporates
the correlated failure syndrome into the intensity pmf through a parameter K that represents
the probability that a pair of modules will produce identical outputs. They derived explicit
expressions for the pmf for N = 2 and N = 3 module software systems, and suggested that an
efficient algorithm is needed to derive the pmf for larger values of N. This paper presents
such an algorithm for generating the intensity pmf for different choices of the parameters N
and K. The algorithm is easily programmable and is particularly suited for use with symbolic
computation packages such as MAPLE®.

The Tomek et al. [3] model for correlated failure is described in Section 2, and the
algorithm for deriving the intensity pmf for chosen values of N and K is presented in Section
3. A MAPLE program for generating the pmf and the output of the program for N = 5 and
K = 0.1 are included in the Appendix.

2 A PROBABILITY MODEL FOR CORRELATED FAILURES


Consider a redundant software system with N independently developed modules. Let Θ_N(X)
be the proportion of modules (out of N) that fail (produce an incorrect output) for a randomly
chosen input X. Then Θ_N(X) is a random variable assuming the values {0, 1/N, 2/N, ..., 1}.
Θ_N(X) is called the intensity function and its probability distribution is referred to as the
intensity distribution. For their probability-based model of correlated failures, Tomek et al.
[3] assume that, for each pair of modules, a proportion K of all possible inputs will always
generate identical outputs for the two modules. Two different pairs of modules may produce
identical outputs on two different sets of inputs, although the proportion K of such inputs
is the same for all pairs. It is further assumed that a module will produce
an incorrect output with probability p. For N = 2 modules, the space of all possible inputs
comprises two subsets, R and its complement R'. R is the set of inputs for which the
two modules produce identical results; for inputs from R' the module outputs are
independent. The intensity function Θ_2(X) assumes the values 0, 1/2, 1 and

Pr[Θ_2(X) = 0]   = Pr[X ∈ R] · Pr[both module outputs are correct | X ∈ R]
                   + Pr[X ∈ R'] · Pr[both module outputs are correct | X ∈ R']
                 = K(1 - p) + (1 - K)(1 - p)^2;

Pr[Θ_2(X) = 1/2] = Pr[X ∈ R'] · Pr[exactly one output is correct | X ∈ R']          (1)
                 = 2(1 - K)p(1 - p);

Pr[Θ_2(X) = 1]   = Pr[X ∈ R] · Pr[both module outputs are incorrect | X ∈ R]
                   + Pr[X ∈ R'] · Pr[both module outputs are incorrect | X ∈ R']
                 = Kp + (1 - K)p^2.
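As a quick numerical illustration of the two-module case, the three probabilities
K(1 - p) + (1 - K)(1 - p)^2, 2(1 - K)p(1 - p) and Kp + (1 - K)p^2 can be evaluated directly.
The sketch below is written in Python rather than MAPLE; the function name and the sample
values K = p = 0.1 are illustrative choices and are not taken from [3].

```python
def intensity_pmf_two_modules(K, p):
    """Return [Pr(Theta_2 = 0), Pr(Theta_2 = 1/2), Pr(Theta_2 = 1)].

    K: proportion of inputs on which the two modules give identical outputs.
    p: probability that a single module produces an incorrect output.
    """
    none_fail = K * (1 - p) + (1 - K) * (1 - p) ** 2
    one_fails = 2 * (1 - K) * p * (1 - p)   # only possible when outputs are independent
    both_fail = K * p + (1 - K) * p ** 2
    return [none_fail, one_fails, both_fail]


# Illustrative values (assumed for this sketch): K = 0.1, p = 0.1.
pmf2 = intensity_pmf_two_modules(K=0.1, p=0.1)
print(pmf2)

# With K = 0 the modules are independent and the pmf is Binomial(2, p).
independent = intensity_pmf_two_modules(K=0.0, p=0.1)
print(independent)
```

Setting K = 0 recovers the independent-failure case, which shows how a positive K moves
probability mass from "exactly one module fails" to the two extreme outcomes.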

Note: MAPLE is a registered trademark of Waterloo Maple Software.
In the case of N = 3 modules, the input space is partitioned into three types of subsets
R_1, R_2, and R_3, where R_i (i = 2, 3) is the set of inputs for which exactly i modules
produce identical results, and R_1 is the set of inputs for which the module outputs are
independent. There are three subsets of type R_2 and just one subset each of
types R_1 and R_3. The probabilities of selecting an input from these subsets are K^2
for R_3, 3K(1 - K) for R_2, and 1 - K^2 - 3K(1 - K) = (1 - K)(1 - 2K) for R_1, and

Pr[Θ_3(X) = j/3] = \sum_{i=1}^{3} Pr[X ∈ R_i] · Pr[exactly j outputs are incorrect | X ∈ R_i],   j = 0, ..., 3.

Therefore

Pr[Θ_3(X) = 0]   = K^2 (1 - p) + 3K(1 - K)(1 - p)^2 + (1 - K)(1 - 2K)(1 - p)^3;

Pr[Θ_3(X) = 1/3] = K^2 · 0 + 3K(1 - K) p(1 - p) + (1 - K)(1 - 2K) · 3p(1 - p)^2;
                                                                                    (2)
Pr[Θ_3(X) = 2/3] = K^2 · 0 + 3K(1 - K) p(1 - p) + (1 - K)(1 - 2K) · 3p^2 (1 - p);

Pr[Θ_3(X) = 1]   = K^2 p + 3K(1 - K) p^2 + (1 - K)(1 - 2K) p^3.
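The three-module probabilities can be sanity-checked with exact arithmetic. The Python sketch
below assumes the subset probabilities K^2, 3K(1 - K) and (1 - K)(1 - 2K), with conditional
terms (1-p), 0, 0, p for R_3; (1-p)^2, p(1-p), p(1-p), p^2 for R_2; and the Binomial(3, p)
terms for R_1. It checks two properties that any valid intensity pmf of this model should
have: the probabilities sum to 1, and the mean of Θ_3(X) equals p because each module fails
with probability p. The function name and sample values are illustrative.

```python
from fractions import Fraction


def intensity_pmf_three_modules(K, p):
    """Return [Pr(Theta_3 = j/3) for j = 0, 1, 2, 3] for the three-module case."""
    q3 = K ** 2                    # all three modules give identical outputs
    q2 = 3 * K * (1 - K)           # exactly two modules give identical outputs
    q1 = (1 - K) * (1 - 2 * K)     # module outputs are independent
    return [
        q3 * (1 - p) + q2 * (1 - p) ** 2 + q1 * (1 - p) ** 3,
        q2 * p * (1 - p) + q1 * 3 * p * (1 - p) ** 2,
        q2 * p * (1 - p) + q1 * 3 * p ** 2 * (1 - p),
        q3 * p + q2 * p ** 2 + q1 * p ** 3,
    ]


# Illustrative values (assumed for this sketch), kept exact with Fraction.
K = Fraction(1, 10)
p = Fraction(1, 10)
pmf3 = intensity_pmf_three_modules(K, p)

total = sum(pmf3)
mean = sum(Fraction(j, 3) * prob for j, prob in enumerate(pmf3))
print([float(x) for x in pmf3])
print(total, mean)
```

Using Fraction avoids floating-point rounding, so both checks can be made with exact equality.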

The calculation of the probabilities of selecting an input from the subsets partitioning
the input space, and of the conditional pmf of Θ_N(X), becomes increasingly difficult as
the number of modules N increases. An efficient algorithm that performs the needed
bookkeeping in a systematic fashion is presented in the next section.

3 AN ALGORITHM FOR GENERATING THE INTENSITY DISTRIBUTION

For an N-module software system, the input space is partitioned into N types of subsets
R_i, i = 1, 2, ..., N. Inputs from subset type R_i result in identical outputs from i of the
N modules. The number of subsets of type R_i, except for type R_1, is equal to \binom{N}{i}, the number
of different ways of selecting i modules from the available N modules. There is just one subset
of type R_1, and the module outputs are independent for inputs from this subset. The table
below illustrates the pattern of the conditional probabilities Pr[Θ_N(X) = j/N | X ∈ R_i] when
N = 5.
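TABLE 1

Subset   j = 0      j = 1        j = 2                     j = 3                     j = 4        j = 5
R_5      (1-p)      0            0                         0                         0            p
R_4      (1-p)^2    p(1-p)       0                         0                         p(1-p)       p^2
R_3      (1-p)^3    2p(1-p)^2    p^2(1-p)                  p(1-p)^2                  2p^2(1-p)    p^3
R_2      (1-p)^4    3p(1-p)^3    3p^2(1-p)^2 + p(1-p)^3    p^3(1-p) + 3p^2(1-p)^2    3p^3(1-p)    p^4
R_1      (1-p)^5    5p(1-p)^4    10p^2(1-p)^3              10p^3(1-p)^2              5p^4(1-p)    p^5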


The probability entries in the table constitute a 5 × 6 matrix P which can be expressed
as the sum A + B of the two triangular matrices

        | (1-p)     0            0               0               0            0   |
        | (1-p)^2   p(1-p)       0               0               0            0   |
    A = | (1-p)^3   2p(1-p)^2    p^2(1-p)        0               0            0   |
        | (1-p)^4   3p(1-p)^3    3p^2(1-p)^2     p^3(1-p)        0            0   |
        | (1-p)^5   5p(1-p)^4    10p^2(1-p)^3    10p^3(1-p)^2    5p^4(1-p)    p^5 |

and

        | 0    0    0           0              0            p   |
        | 0    0    0           0              p(1-p)       p^2 |
    B = | 0    0    0           p(1-p)^2       2p^2(1-p)    p^3 |
        | 0    0    p(1-p)^3    3p^2(1-p)^2    3p^3(1-p)    p^4 |
        | 0    0    0           0              0            0   |
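The rows of P = A + B for N = 5 can be cross-checked by brute-force enumeration. The Python
sketch below assumes, as the matrix entries suggest, that for an input from a subset of type
R_i the i identical modules fail or succeed together (with failure probability p) while the
remaining N - i modules fail independently. The function name `conditional_pmf` and the value
p = 0.1 are illustrative choices.

```python
from itertools import product


def conditional_pmf(N, i, p):
    """Pmf of the number of failed modules given an input from a type-R_i subset.

    The i identical modules are treated as one unit of size i; the other
    N - i modules are units of size 1. Units fail independently with probability p.
    """
    unit_sizes = [i] + [1] * (N - i)
    pmf = [0.0] * (N + 1)
    # Enumerate every fail (1) / succeed (0) pattern over the independent units.
    for pattern in product([0, 1], repeat=len(unit_sizes)):
        prob = 1.0
        failed_modules = 0
        for size, fails in zip(unit_sizes, pattern):
            prob *= p if fails else (1 - p)
            failed_modules += size * fails
        pmf[failed_modules] += prob
    return pmf


# Row for R_2 when N = 5 (the fourth row of A + B), evaluated at p = 0.1.
row_r2 = conditional_pmf(5, 2, 0.1)
print(row_r2)

# Row for R_5: all five modules agree, so only 0 or 5 failures can occur.
row_r5 = conditional_pmf(5, 5, 0.1)
print(row_r5)
```

Enumeration costs 2^(N-i+1) patterns per row, so it is useful only as a check for small N;
the closed-form entries of A and B avoid this cost.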

The above pattern persists for all N, and in the general case the two N × (N+1) matrices
have the form A = (a_ij) and B = (b_ij), where

    a_ij = \binom{i-1}{j-1} p^{j-1} (1-p)^{i-j+1}     for j ≤ i, i = 1, 2, ..., N-1
    a_ij = \binom{N}{j-1} p^{j-1} (1-p)^{N-j+1}       for i = N
    a_ij = 0                                          otherwise

    b_ij = \binom{i-1}{N+1-j} p^{i+j-N-1} (1-p)^{N+1-j}     for j > N-i+1, i = 1, 2, ..., N-1          (5)
    b_ij = 0                                                otherwise

The entries of the matrix P = (a_ij + b_ij) are the conditional probabilities
Pr[Θ_N(X) = (j-1)/N | X ∈ R_{N-i+1}], i = 1, ..., N and j = 1, ..., N+1; row i corresponds to
the subset type in which N-i+1 modules produce identical outputs, so row 1 is R_N and row N
is R_1. The unconditional probabilities, that is, the intensity distribution, are obtained by
multiplying the matrix P on the left by the 1-row matrix Q = [q_1, q_2, ..., q_N], where

    q_i = \binom{N}{i-1} K^{N-i} (1-K)^{i-1}     for i = 1, 2, ..., N-1          (6)

    q_N = 1 - \sum_{i=1}^{N-1} q_i.                                              (7)

Note that Q is just the vector of probabilities for an input to be in each of the subset
types R_N, R_{N-1}, ..., R_1.
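The vector Q can be computed in a few lines. The Python sketch below assumes the entries
q_i = \binom{N}{i-1} K^{N-i} (1-K)^{i-1} for i = 1, ..., N-1 and q_N = 1 minus their sum, ordered
R_N, ..., R_1. The function name and the range check are additions of this sketch and do not
appear in the source.

```python
from fractions import Fraction
from math import comb


def subset_type_probabilities(N, K):
    """Return [q_1, ..., q_N], the probabilities of subset types R_N, ..., R_1."""
    q = [comb(N, i - 1) * K ** (N - i) * (1 - K) ** (i - 1) for i in range(1, N)]
    q_last = 1 - sum(q)
    # Guard (an addition of this sketch, not part of the source): the remainder
    # q_N is a probability only if the first N - 1 entries sum to at most 1.
    if q_last < 0:
        raise ValueError("K is too large for this N: q_N would be negative")
    q.append(q_last)
    return q


# N = 5, K = 1/10, kept exact with Fraction and printed as decimals.
q5 = subset_type_probabilities(5, Fraction(1, 10))
print([float(x) for x in q5])

# For N = 3 the last entry reduces to (1 - K)(1 - 2K).
q3 = subset_type_probabilities(3, Fraction(1, 10))
print(q3[-1] == (1 - Fraction(1, 10)) * (1 - 2 * Fraction(1, 10)))
```

For N = 5 and K = 0.1 the entries agree with the Q(5, .1) values listed in the Appendix.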

The algorithm for computing the intensity pmf of Θ_N(X) can be described by the
following 3-step process. For specified values of the parameters N and K (and of the module
failure probability p):

1. Determine the 1-row matrix Q of input probabilities.

2. Evaluate the matrix P = A + B, the matrix of conditional probabilities
   Pr[Θ_N(X) = (j-1)/N | X ∈ R_{N-i+1}], with rows ordered R_N, ..., R_1.

3. Compute the matrix product Q × P, which is a 1 × (N+1) matrix, to obtain the
   intensity pmf of Θ_N(X).
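The three steps can be sketched in Python as follows. This is a translation made for
illustration, not the MAPLE program of the Appendix; the function name `intensity_pmf` is
hypothetical, and the module failure probability p is passed as a separate argument from K.
The matrix entries follow the piecewise definitions of a_ij, b_ij and q_i given above.

```python
from math import comb


def intensity_pmf(N, K, p):
    """Return [Pr(Theta_N = j/N) for j = 0, ..., N] via the product Q x P."""
    # Step 1: the 1-row matrix Q (subset types ordered R_N, ..., R_1).
    q = [comb(N, i - 1) * K ** (N - i) * (1 - K) ** (i - 1) for i in range(1, N)]
    q.append(1 - sum(q))

    # Step 2: the N x (N+1) matrix P = A + B, using 1-based indices i, j.
    P = [[0.0] * (N + 1) for _ in range(N)]
    for i in range(1, N + 1):
        for j in range(1, N + 2):
            if i < N:
                a = comb(i - 1, j - 1) * p ** (j - 1) * (1 - p) ** (i - j + 1) if j <= i else 0.0
                b = (comb(i - 1, N + 1 - j) * p ** (i + j - N - 1) * (1 - p) ** (N + 1 - j)
                     if j > N - i + 1 else 0.0)
            else:
                # Last row: all modules independent, plain binomial terms.
                a = comb(N, j - 1) * p ** (j - 1) * (1 - p) ** (N - j + 1)
                b = 0.0
            P[i - 1][j - 1] = a + b

    # Step 3: the row vector Q times the matrix P.
    return [sum(q[i] * P[i][j] for i in range(N)) for j in range(N + 1)]


pmf5 = intensity_pmf(5, 0.1, 0.1)
print(pmf5)
```

With N = 5 and K = p = 0.1 this reproduces, up to floating-point rounding, the intensity pmf
printed at the end of the Appendix. The cost is on the order of N^2 arithmetic operations, so
larger N poses no difficulty.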
APPENDIX

Three MAPLE procedures, r(N, K), Q(N, K) and P(N, K), that generate the matrix Q in
(6) and (7) and the matrix P = A + B in (3) are shown below. A printout of the MAPLE
output creating these procedures and the computational results for N = 5 and K = 0.1 is also
included.

# r(N, K): the last entry q_N of Q, see (7).
r := (N, K) -> 1 - sum(binomial(N, i)*K^(N-1-i)*(1 - K)^i, i = 0..N-2);

# Q(N, K): the 1-row matrix Q of (6) and (7).
Q := (N, K) -> array([seq(binomial(N, i)*K^(N-1-i)*(1 - K)^i, i = 0..N-2), r(N, K)]);

# P(N, K): the matrix P = A + B. Here the second argument plays the role of the
# module failure probability p in the definitions of a_ij and b_ij.
P := proc(N, K) local A, B, C, s, t;

  A := array(1..N, 1..N+1):
  for s to N do
    for t to N+1 do
      if s < N then
        if s > t - 1 then
          A[s,t] := binomial(s-1, t-1)*K^(t-1)*(1 - K)^(s-t+1)
        else A[s,t] := 0 fi;
      else A[s,t] := binomial(N, t-1)*K^(t-1)*(1 - K)^(N-t+1); fi; od;
  od;

  B := array(1..N, 1..N+1):
  for s to N do
    for t to N+1 do
      if s < N then
        if t > N - s + 1 then
          B[s,t] := binomial(s-1, N+1-t)*K^(s+t-N-1)*(1 - K)^(N+1-t)
        else B[s,t] := 0 fi;
      else B[s,t] := 0 fi; od;
  od;

  C := evalm(A + B);
end;

Finally, the MAPLE expression

evalm(Q(N, K) &* P(N, K));

will display the desired intensity pmf.
[MAPLE session printout: the definitions of r, Q and P are entered as listed above, and
MAPLE echoes each procedure in pretty-printed form.]

> r(5,.1);

                              .1854000000

> Q(5,.1);

              [.0001, .0045, .0810, .7290, .1854000000]

> P(5,.1);

     [ .9        0         0         0         0         .1     ]
     [ .81       .09       0         0         .09       .01    ]
     [ .729      .162      .009      .081      .018      .001   ]
     [ .6561     .2187     .0972     .0252     .0027     .0001  ]
     [ .59049    .32805    .07290    .00810    .00045    .00001 ]

> evalm(Q(5,.1) &* P(5,.1));

[.6505577460, .2337797700, .08510346000, .02643354000, .003914730000, .0002107540000]